Abstract

Hidden services were deployed on the Tor 
anonymous communication network in 2004. An- 
nounced properties include server resistance to dis- 
tributed DoS. Both the EFF and Reporters Without 
Borders have issued guides that describe using hid- 
den services via Tor to protect the safety of dissi- 
dents as well as to resist censorship. 

We present fast and cheap attacks that reveal the location
of a hidden server. Using a single hostile Tor
node we have located deployed hidden servers in
a matter of minutes. Although we examine hidden
services over Tor, our results apply to any client using
a variety of anonymity networks. In fact, these
are the first actual intersection attacks on any de- 
ployed public network: thus confirming general ex- 
pectations from prior theory and simulation. 

We recommend changes to route selection design 
and implementation for Tor. These changes require 
no operational increase in network overhead and 
are simple to make; but they prevent the attacks we 
have demonstrated. They have been implemented. 


1 Introduction 

Tor is a distributed low-latency anonymous 
communication network developed by the Naval 
Research Laboratory and the Free Haven Project. 
It is currently the largest anonymity network in existence,
with about 450 server nodes around the
world at the time of writing. It is popular and highly
recommended: it was rated one of the hundred best 
products of 2005 by PC World. Since 2004 Tor has 
also been used to underlie services offered from hidden
locations. These were introduced as resistant
to distributed DoS since they were designed
to require a DDoS attack on the entire Tor net- 
work in order to attack a hidden server. Hidden 
servers have also been recommended for preserving 
the anonymity of the service offerer and to resist 
censorship. Specifically Undergroundmedia.org 
has published a guide to “Torcasting” (anonymity- 
preserving and censorship-resistant podcasting). 
And both the Electronic Frontier Foundation and 
Reporters Without Borders have issued guides that 
describe using hidden services via Tor to protect the 
safety of dissidents as well as to resist censorship. 
There have been several recent cases in the news in 
which anonymous bloggers have or have not been 
exposed and have or have not lost jobs, etc., as a re- 
sult, depending on the policy of their ISP, the inter- 
pretation of laws by various courts, and numerous 
other factors. Recommendations for a technology 
to protect anonymous bloggers and other publish- 
ers, regardless of legal protection, would thus seem 
to be timely and encouraging. 

The Tor developers are careful, however, to warn 
against using Tor in critical situations: upon startup 
the Tor client announces, “This is experimental 
software. Do not rely on it for strong anonymity.” 
Nonetheless, with increasing high-profile recom- 
mendations to use Tor’s hidden services for appli- 
cations such as those above, it is important to assess 
the protection they afford. In this paper we demonstrate
attacks (not simulations) on the deployed Tor
network that reveal the location of a hidden server. 
The attacks are cheap and fast: they use only a sin- 
gle hostile Tor node and require from only minutes 
to a few hours to locate a hidden server. 

Although we examined hidden services on Tor, 
our results are not limited in principle to either hid- 
den services or to Tor. They should apply to the 
hidden service design even if run on another under- 
lying anonymity network and should apply to other 
clients using an anonymity network, not just to hid- 
den servers. 

We believe that ours are the first attacks that lo- 
cate hidden servers, whether on hidden services on 
a deployed network or in simulation. Also, while 
there have been simulations and analytic results, we 
believe ours are the first published intersection at- 
tacks carried out on a deployed anonymity network. 

In Section 2 we review previous related work.
In Section 3 we describe the design of Tor’s hidden
services. In Section 4 we present various attacks
and the experimental results from running them.
In Section 5 we describe various countermeasures
that might be taken against our attacks and their
effectiveness. We also describe an implementation
feature of Tor that our experiments uncovered
and how to change it to better resist the attacks.
In Section 6 we conclude with recommendations
for simple-to-implement design changes to hidden 
services. These also should not add to the number 
of hops or otherwise increase overhead to the de- 
sign, but they should resist our attacks. We have 
discussed both the implementation feature we un- 
covered and our recommended design changes with 
the Tor developers. As a result, the latest version of 
Tor is resistant to the attacks we present herein. Fi- 
nally, we discuss open problems and future work. 

2 Previous Work on Hiding Services 
and Anonymity 

The earliest reference we can find to a system
that hides the location of a service from those using
it is Ross Anderson’s Eternity Service. Therein
it is suggested that servers hold encrypted files, and
these files are to be accessed by anonymous communication
to prevent uncovering of the location
of a server from which the file is being retrieved. 
Early presentations of onion routing from the same 
era described the use of onion routing to hide the 
location of an automated classification downgrader 
so that users of the service would not be able to attack
it. Earlier still, Roger Needham noted the fundamental
connection between anonymity and the
inability to selectively deny service [19, 20], which
was one of the motivating ideas in the Eternity Service.
The idea of hiding the location of a document
(or encrypted fragment of a document) also
underlies many censorship-resistant publishing designs
such as Free Haven and Tangler [28].

Anonymous communication networks were introduced
by David Chaum. He described a network
that distributes trust across multiple nodes
that carry the communication. The design is of 
a public-key-based, high-latency anonymous com- 
munication network such as might be appropriate 
for email. It is not for use in bidirectional, low- 
latency communication, such as web traffic, chat, 
remote login, etc. Low-latency communication 
anonymity was introduced for ISDN [22], but made
to anonymize within a group of users exchanging 
fixed and equal bandwidth with a local telephone 
switch rather than anonymizing within an Internet- 
wide group with diverse bandwidth needs such as 
occur in the just mentioned applications. The oldest 
anonymous communication system for web traffic 
is probably the Anonymizer [4]. Unlike the Chaum
design, all traffic passes through a single proxy, 
making it a single point of failure and/or attack in 
many ways. Also unlike Chaum mixes, it does not 
actually delay and mix traffic. Traffic is processed 
FIFO. The Anonymizer is also probably one of 
the most widely used anonymization systems: they 
claim to have millions of users. 

The first published, as well as the first deployed,
distributed system for low-latency anonymous Internet
communication was onion routing in
1996, followed by the Freedom Network [8] from
1999 to 2001. The current version of onion routing,
Tor [13], was deployed in late 2003, and hidden services
using Tor were deployed in early 2004.

All of these low-latency anonymity systems
work by proxying communication through multi- 
ple hops; at each hop the communication changes 
its appearance by adding or removing a layer of en- 
cryption (depending on whether it is traveling from 
the circuit originator to responder or vice versa). 
They all use public key cryptography to distribute 
session keys to the nodes along a route, thus es- 
tablishing a circuit. Each session key is shared be- 
tween the circuit initiator (client) and the one node 
that was given the key in establishing the circuit. 
Data that passes along the circuit uses these ses- 
sion keys. Both Freedom and Tor have a default 
circuit length of three nodes. For more details con- 
sult the above cited work. The Java Anon Proxy 
(JAP)/Web MIXes is another popular system for
diffused low-latency anonymity. However, unlike 
the others mentioned here, it works by mixing and 
by diffusing only trust and jurisdiction. It does not 
hide where communication enters and leaves the 
network. All communication that enters at one lo- 
cation leaves together (now mixed) at another loca- 
tion. As such it is not directly amenable to the hid- 
den service design to be described presently. JAP 
has been deployed since 2000. 

Hidden services in Tor, as described in the next
section, rely on a rendezvous server,
which mates anonymous circuits from two principals
so that each relies only on himself to build
a secure circuit. The first published design for a
rendezvous service was for anonymous ISDN telephony
[22] rather than Internet communication. As
such it had very different assumptions and requirements
from the rendezvous servers we describe,
some of which we have already noted above. A rendezvous
server for IRC chat was mentioned in earlier work;
however, the first detailed design for a rendezvous
server for Internet communication was by Goldberg.
It differs in many ways from rendezvous
servers as used by Tor’s hidden services, but we 
will not discuss Goldberg’s design further here. 

There is much literature on attacking anonymous
communication. Rather than single out
any of it here, we cite the relevant prior literature
at appropriate points below. The current paper is
the first to focus specifically on attacks for locating
hidden services.

3 Location-hidden Services in Tor 

One of the major vulnerabilities for a hidden ser- 
vice in Tor is the server’s selection of the first and 
last node in the communication path. To a first ap- 
proximation, if an adversary can watch the edges of 
a Tor circuit, then she can confirm who is commu- 
nicating. This is because the low-latency require- 
ments make it easy to confirm the timing signature 
of traffic flowing (in both directions) over the cir- 
cuit. This is true whether the adversary controls 
the Tor nodes at the edges of the circuit or is just 
observing the links from those nodes to the initia- 
tor and responder. Actually, this vulnerability has 
always been alleged and assumed but never previ- 
ously demonstrated. A byproduct of our analysis of 
hidden services is that we experimentally corrobo- 
rate this traffic confirmation on Tor circuits. For 
hidden services, this means that the service is vul- 
nerable in every communication path it sets up with 
a client if a member of the path can determine it is 
being used by a hidden service and that it is the first 
node in the path. 

In order to see how our attacks that locate hidden
servers are done we need to describe how the
hidden service communication works. Fig. 1 shows
a normal setup of this communication channel.

In the current implementation of Tor, a con- 
nection to a hidden service involves five important 
nodes in addition to the nodes used for basic anony- 
mous communication over Tor. 

• HS, the Hidden Server offering some kind of 
(hidden) service to the users of the Tor net- 
work, e.g. web pages, mail accounts, login 
service, etc. 

• C, the client connecting to the Hidden Server. 

• DS, a Directory Server containing information 
about the Tor network nodes and used as the 
point of contact for information on where to 
contact hidden services. 

• RP, the Rendezvous Point is the only node in
the data tunnel that is known to both sides.

• IP, the Introduction Point where the Hidden
Server is listening for connections to the hidden
service.

Figure 1. Normal use of hidden services
and rendezvous servers

A normal setup of communication between a
client and a hidden service is done as shown
in Fig. 1. All the displayed message flows are
anonymized, i.e., they are routed through several
anonymizing nodes on their path towards the other
end, as described in Section 2. Every arrow and
connection in the figure represents an anonymous
channel consisting of two or more intermediate
nodes. (Hereafter, we use ‘node’ to refer exclusively
to nodes of the underlying anonymization
network, sometimes also called ‘server nodes’. Although
we are considering the Tor network specifically,
the setup would apply as well if some other
anonymizing network were used to underlie the
hidden service protocol. The only exceptions are
C and HS, which may be anonymization nodes
or they may be merely clients external to the
anonymization network.)

First the Hidden Server connects (1) to a node
in the Tor network and asks if it is OK for the node
to act as an Introduction Point for his service. If
the node accepts, HS keeps the circuit open and continues;
otherwise HS tries another node until successful.
These connections are kept open forever,
i.e., until one of the nodes restarts or decides to
pull it down.¹ Next, the Hidden Server contacts (2)
the Directory Server and asks it to publish the contact
information of its hidden service. The hidden
service is now ready to receive connection requests
from clients.

In order to retrieve data from the service the 
Client connects (3) to DS and asks for the contact 
information of the identified service and retrieves 
it if it exists (including the addresses of Introduc- 
tion Points). There can be multiple Introduction 
Points per service. The Client then selects a node 
in the network to act as a Rendezvous Point, con- 
nects (4) to it and asks it to listen for connections 
from a hidden service on C’s behalf. The Client re- 
peats this until a Rendezvous Point has accepted, 
and then contacts (5) the Introduction Point and 
asks it to forward the information about the selected 
RP.² The Introduction Point forwards (6) this message
to the Hidden Server who determines whether
to connect to the Rendezvous Point or not.³ If OK,
the Hidden Server connects (7) to RP and asks to 
be connected to the waiting rendezvous circuit, and 
RP then forwards (8) this connection request to the 
Client. 

Now RP can start passing data between the two 
connections and the result is an anonymous data 
tunnel (9) from C to HS through RP. 

From this we observe the following facts about 
the nodes in the network: 

• C does not know the location (IP address) of 
HS, but knows the location of RP; 

• HS does not know the location of C, but knows 
the location of RP; 

• RP does not know the location of either C or 
HS, and he knows neither the service he is 
serving nor the content of the messages re- 
layed through him; 

• there are multiple (currently three) nodes between
HS and RP and two nodes between C
and RP, to hide traffic and create a degree of
anonymity on both ends; and

• any member of the network which claims to
offer stability can be used by HS to form an
anonymous tunnel to RP, including C if it is
a node in the anonymization network. This is
the basis of our attacks.

1 In Tor, any node in a circuit can initiate a circuit teardown.

2 Optionally, this could include authentication information
for the service to determine from whom to accept connections.

3 This flow is over the same anonymous circuit as (1), similarly
for (4) and (8).

Figure 2. Vulnerable location of Attacker
in communication channel to
the Hidden Server

4 Attacks and Experimental Results 

We have done experiments using multiple attack 
methods in order to determine the IP address of the 
Hidden Server. We will here first describe the setup 
of the experiment and then four attack methods. 
The attacks can be carried out by an adversary that 
controls merely a single node in the network. Since 
anyone can run a Tor node simply by volunteering, 
this is trivial. (In fact the adversary need only run 
a “middleman” node, which never lets circuits exit 
the anonymization network. The burden of running 
a middleman node is typically less than that of run- 
ning an exit node. Of the roughly 250 nodes in the 
Tor network at the time the experiments were done 
only about 100 allowed exit to port 80.) At the end 
we describe an accelerated attack using two com- 
promised nodes. 

Fig. 2 shows the scenarios that an attacker, hereafter
Alice, wants to achieve in connections to the
Hidden Server. Alice controls the Client and one 
node. Her goal is to control Node 1 of the circuit. 
Certain circuits will yield a match of traffic pattern 
with what is expected given when C sends to and 
receives from HS. Alice will look for such pattern
matches among all active circuits through the node
she owns. If she finds a match, then her node has 
been made part of the circuit between the Hidden 
Server and the Rendezvous Point as Node 1, 2 or 3. 
From this she will be able to determine a few facts. 
First she will know when she has the node closest to 
RP (Node 3) since she knows RP’s IP address, and 
she can easily abandon the circuit and attack again. 
Second, if her node has an unknown IP address on 
both sides of the matching circuit, she knows she is 
either Node 1 connected directly to HS or Node 2 
in the circuit. This enables her to use timing or sta- 
tistical methods to determine her position as will be 
described later. 

We will continue sampling data until we have 
enough to determine when Alice is connecting to 
the hidden service as Node 1 in the circuit to- 
wards RP, at which point we will know the Hidden 
Server’s IP address. 

Our attack description is obviously based on 
hidden services as deployed on Tor; however, the 
basic approach will identify a client of a low-latency,
free-route anonymity network, not just hidden
servers using Tor. These attacks should work on
networks such as Freedom [8] or Crowds, despite
their many differences from Tor. For systems
such as Web MIXes, it is difficult to briefly say
anything about either what a hidden service design 
over such a system would look like or about the 
relation to our attacks. On the one hand, becom- 
ing a node in the network is tightly controlled, and 
all circuits are through cascades (shared uniform 
fixed routes). Thus, our attacks would simply not 
be possible. On the other hand, much of the point 
of the attacks is to determine the point where con- 
nections enter and leave the anonymity network. In 
Web MIXes, this information is given, so there is 
no need to attack to obtain it. 

4.1 Experimental Setup 

Our experiments were conducted using two dif- 
ferent hidden services running at client nodes con- 
necting to the Tor network, one in Europe and one 
in the US. The services offered a couple of web 
pages and images, which were pulled down in different
ways and with different timing patterns. The
documents and images varied in size from 2KB to
120KB. The connection from the Client to the Hidden
Server was done through a random Rendezvous
Point (cf. Section 4.6), and the connection from
the Client to the Rendezvous Point was shortened
to a path length of one. (This will be described
more fully presently.)

The Hidden Service was not offered, or known 
to, any other node in the network except the direc- 
tory service. Only the Client knew about how to 
contact the service, so that all contact to and from 
the Hidden Server was either caused by the Client, 
or by the Hidden Server preparing to operate (mak- 
ing circuits and downloading new updates from the 
Directory Servers). This is not a limitation on the 
implications of the experimental results for publicly
known and accessed hidden services: the timings
of data are done with high enough precision
so the possibility of two identical patterns from the
same service routing through the same node at the
exact same time is negligible.⁴ No hidden server is
likely to get two specially designed requests (like
ours) from distinct clients and respond to them at
the exact same time. Thus a false positive in our
timing analysis is highly unlikely (Section 4.2).

The Client computer was also announced as a 
middleman node, i.e. not having connections out 
of the anonymity network, and this node is where 
all Alice’s sampling of data takes place. By using 
the node both as a server inside the network and as 
the Client asking for the web pages from the Hid- 
den Server, the attacker is able to get precise timing 
without having to externally synchronize the time 
with another node. This server node in the Tor net- 
work had to use a logging mechanism when sam- 
pling the active circuits during the attacks. In order 
to avoid reference to the correct IP address during 
the timing analysis we converted the IP addresses 
by use of a simple prefix-preserving scheme. If we
were to use permanent logging of data, we would
use a better and more secure pseudonymizing IP
logging scheme.

4 One reason for not doing the experiment on a publicly
known server in the Tor network is of course the possible legal
implications. In addition, not wanting to cause harm to
the project and its participants, we avoided announcements until
there were countermeasures available and deployed.
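
The prefix-preserving conversion mentioned above is not specified further in the text. As a rough illustration only, the sketch below shows one way such a mapping can be built: each output bit is the input bit, possibly flipped, where the flip depends only on a keyed hash of the preceding bits. The function name, the key, and the use of HMAC-SHA256 are assumptions made for this example and are not the scheme used in the experiments.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize_ipv4(address: str, key: bytes) -> str:
    """Map an IPv4 address to a pseudonym that preserves shared prefixes.

    Hypothetical helper: two inputs that agree on their first k bits
    produce outputs that also agree on their first k bits.
    """
    bits = format(int(ipaddress.IPv4Address(address)), "032b")
    out = []
    for i in range(32):
        # The flip decision for bit i depends only on the i bits before it,
        # so addresses with a common prefix get the same flips on that prefix.
        digest = hmac.new(key, bits[:i].encode(), hashlib.sha256).digest()
        flip = digest[0] & 1
        out.append(str(int(bits[i]) ^ flip))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))
```

With a fixed secret key the mapping is deterministic, so the same real address always appears under the same pseudonym in the logs, while addresses from the same subnet remain recognisably grouped.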

The attacker must also make some minor 
changes to the application code running at the 
Client node in order to enable and strengthen the 
attacks: 

• Alice’s Client will connect directly, i.e. in one 
hop, to the Rendezvous Point to shorten the 
path and latency of traffic between the Client 
and the Hidden Server, thereby making it eas- 
ier to set up and correlate the traffic patterns. 

• Alice’s Client will tear down the circuit to a 
Hidden Server after each pattern is success- 
fully communicated. This will disable reuse 
of circuits and force the construction of a new 
circuit on the next connection request. 

• In addition to being the Client, Alice is also 
running as a server middleman node partici- 
pating in the network and carrying traffic for 
the other nodes. She will maintain a list of 
active circuits (routed connections) and try to 
correlate the generated circuit data with all the 
other circuits to find out if she is carrying the 
same traffic data as both Client and as a server 
node. 

• Alice’s server node will report a false higher 
uptime and the maximum network bandwidth 
to the directory server in order for other nodes 
to trust it for their circuits. This is still possible 
as there is (yet) no method for keeping reliable 
track of uptime at the different servers. 

Once this is implemented, Alice is ready to use 
the methods of attack described below. 

4.2 Timing Analysis 

The attacker uses the logged timing data and di- 
rection information from the generated data set and 
the sampled data set (from each circuit active in that 
period of time) to accomplish two different things: 

1. Positively identify that Alice’s node is made a
part of a circuit; and

2. If (1) is true, determine at which position in
the circuit she is located.

Figure 3. Example of data sets, matching
and response times

To identify if the generated data is found within 
one of the sampled data sets, Alice is faced with a 
comparison of one sampled data set, the generated 
set done by the Client, to all of the sampled data 
sets done by the server. For all connections to the 
Hidden Server there were a few hundred circuits 
active at Alice’s Tor node during each of the sample 
periods. 

Our match confirmation is an extended version
of the packet counting attack described by Serjantov
and Sewell [24]. In addition to basic counting
of cells, we also make use of precise timing information
of when cells were received and transmitted,
and the direction of each individual cell passing
in order to determine a circuit match. An example
is depicted in Fig. 3. Alice uses the direction of
traffic in addition to the timing of the cells’ arrival 
to match all outgoing and incoming traffic in the 
generated data set. Notice that there is also noise 
occurring in the sampled data set. We compare our 
known data to one other specific set at a time, and 
our algorithm only checks if the generated data may 
be a part of the sampled data. Therefore, it makes 
no estimate of how probable the match is. 

Alice is also able to separate when there is a sin- 
gle match and when there are multiple matches in 
a data set. There is a potential for multiple matches 
in a set, for example, if a circuit is carrying lots of 
traffic in both directions, we will probably have a 
timing “match” due to the data load. In the attack 
Alice knows that only one attack circuit is set up
at a time, and each attack circuit is set up for only
Alice’s requests at that time. So, she can use the
amount of traffic relayed through the attack circuit
as a parameter. The small overhead and extra information
from setting up the tunnelled connections
etc., should not be more than a few cells, something
our experiments confirm. Therefore the attacker
may discard the samples that are more than
a few percent larger than the generated set.⁵
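
The matching step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the experiments: cells are represented as (timestamp, direction) pairs, both lists are sorted by time and share one clock (the Client and the node run on the same machine), and the names `is_candidate`, `max_delay` and `slack` as well as their default values are hypothetical. As in the text, the check only says whether the generated data may be part of the sampled data; it gives no probability.

```python
from typing import List, Tuple

# (timestamp in seconds, direction), direction is "to_hs" or "from_hs"
Cell = Tuple[float, str]


def is_candidate(generated: List[Cell], sampled: List[Cell],
                 max_delay: float = 2.0, slack: float = 0.05) -> bool:
    """Return True if the generated cells may be contained in the sample."""
    # Discard samples more than a few percent larger than the generated set.
    if len(sampled) > len(generated) * (1 + slack):
        return False
    j = 0
    for t_gen, direction in generated:
        # Scan forward for the next sampled cell that fits this generated cell.
        while j < len(sampled):
            t_obs, d_obs = sampled[j]
            j += 1
            if d_obs != direction:
                continue  # noise cell or other direction, skip it
            # Cells to HS pass the node after the Client sends them;
            # cells from HS pass the node before the Client receives them.
            lag = t_obs - t_gen if direction == "to_hs" else t_gen - t_obs
            if 0.0 <= lag <= max_delay:
                break
        else:
            return False  # ran out of sampled cells without a fit
    return True
```

Running this once per active circuit leaves only those circuits whose cell pattern is compatible with what the Client generated.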

Multiple matches could also be a possible result 
in a future scenario where the circuits may be used 
to carry data for different clients. In this case the 
attacker must try to retrieve a list of all possible 
matches of the generated data within the sampled 
set and should then be able to use correlation tech- 
niques on the timing data to calculate a probability 
of the best match. We have not tested this part as 
it would require major changes in functionality for 
the deployed Tor network. 

4.3 Service Location Attack 

First we look at two different situations, the 
Server Scenario and the Client Scenario, based on 
whether the hidden service is located on a node 
within the anonymity network, or on a client us- 
ing the network but not participating as a network 
node. 

The Client Scenario is most often used when it is 
desired that HS not be listed in the directory service 
as a participating server node of the network. An- 
other reason for this scenario is that the user may be 
unable to set up a node directly reachable from the 
Internet (e.g., it must be located behind a firewall) 
but still wants to offer his service. 

The Server Scenario is most often used to hide 
the service traffic within all the other traffic running 
through the server node. This is often regarded as 
a reasonable effort to improve the cover of traffic 
originating at the node. 

5 Actually the overhead is normally less than 10 cells, but the
extra margin has not given any false positives yet.

Table 1. Experimental results of our attacks.

          Sample   Time to       Circuits    Matched    Largest     Second
          time     first match   completed   circuits   single IP   largest
Server 1  7.8h     15 min        676         37         46%         5%
Server 1  6.8h     3 min         432         26         54%         7%
Server 2  4.9h     28 min        447         31         71%         3%
Server 2  10.6h    3 min         990         56         54%         7%

A problem with the Server Scenario is that it is
possible to correlate information about availability
of a service and the availability of the nodes listed
in the Directory Service. E.g., we poll each listed
server every five minutes and correlate the lists of
active servers when we are able and unable to contact
the hidden service.
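
The availability correlation for the Server Scenario can be illustrated with a small sketch. It assumes each poll is recorded as a pair (service reachable, set of nodes that answered); the function name and the ranking rule are hypothetical choices for this example. A node hosting the service must be up whenever the service is reachable, so candidates are intersected over those polls, and nodes that were often up while the service was down are ranked last.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def correlate_availability(polls: Iterable[Tuple[bool, Set[str]]]) -> List[str]:
    """Rank listed nodes by how well their uptime matches the service's."""
    candidates = None
    conflicts = Counter()  # times a node was up while the service was down
    for service_up, nodes_up in polls:
        nodes_up = set(nodes_up)
        if service_up:
            # The host must be among the nodes that were up in this poll.
            candidates = nodes_up if candidates is None else candidates & nodes_up
        else:
            conflicts.update(nodes_up)
    if candidates is None:
        return []
    # Fewest conflicts first; ties broken by name for a stable order.
    return sorted(candidates, key=lambda n: (conflicts[n], n))
```

The longer the polling runs and the more often nodes go up and down, the smaller the candidate list becomes.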

Our attack is based on the availability of a di- 
rectory service containing a public list of all server 
nodes in the anonymity network. From this list the 
attacker will immediately be able to tell the differ- 
ence between connections from mere clients and 
connections from network nodes. If a service is lo- 
cated at a client outside the anonymizing network, 
Alice will know both this and the client’s IP address
as soon as she has a positive match in the
timing analysis (Section 4.2) on a connection originating
from outside of the currently listed server
nodes. There is no other way the hidden service can 
communicate with the attacker’s node from outside 
the Tor network unless this is the actual location of 
the hidden service, or its point of contact, e.g., a 
firewall hiding internal addresses, an IPSec tunnel 
endpoint, etc. 
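
The decision Alice makes after a positive timing match can be summarised in a short sketch. The function and parameter names are hypothetical; the logic follows the reasoning in the text: a neighbour equal to the known RP address means Node 3, a neighbour on the HS side that is absent from the public directory list means the hidden server (or its point of contact) has been found, and otherwise the position is still ambiguous.

```python
from typing import Set


def interpret_match(hs_side_ip: str, rp_side_ip: str,
                    rp_ip: str, listed_nodes: Set[str]) -> str:
    """Classify a matched circuit seen at Alice's node."""
    if rp_side_ip == rp_ip:
        # Alice is next to the Rendezvous Point; abandon and try again.
        return "node 3"
    if hs_side_ip not in listed_nodes:
        # Not a listed server node, so this is HS or its point of contact.
        return "located"
    # Both neighbours are listed nodes other than RP.
    return "node 1 or 2"
```

Only the last case requires the statistical methods described in the following sections.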

Experimental Results: Our results confirmed
this by a simple matching of the IP addresses in the
sampled circuits against the list of known servers.
Both of our Hidden Servers were run on Client
nodes and were easily confirmed as the source of
the service. The time for the attack until successful
identification of the IP address in the four tests
of the experiment is shown in Table 1 under the
column “Time to first match”.

So, if the hidden service is located at a client 
of the anonymity network an attacker will find it 
in a matter of minutes using only one node in the 
network, but if the service is located at a network 
node we will have to use another method. 


4.4 The Predecessor Attack 

The current implementation of the network is 
vulnerable to the predecessor attack. This is
a form of intersection attack. Since intersection at- 
tacks treat the intervening anonymity network as a 
black box, they are a threat to any anonymity net- 
work. Like other intersection attacks, the predeces- 
sor attack has been shown to be devastating in the- 
ory and simulation against various anonymity net- 
works but has never before been demonstrated on a 
live network. Roughly, the predecessor attack looks 
at repeated connections suspected to be to (from) 
the same correspondent and looks at intersections 
of predecessor nodes to see which occurs most of- 
ten. Our use of this attack is based on the assump- 
tion that the attacker is able to positively identify 
the actual streams of data to and from the client in 
other circuits, e.g. by using the Timing Analysis 
described in Section 4.2.

In the case of Hidden Servers and using our sce- 
nario of attack, the Predecessor Attack becomes 
trivial. Alice can now make statistics of the IP ad- 
dresses that contacted the server in the cases where 
a positive traffic-pattern match was found. By se- 
lecting only circuits where there has been a match, 
and using an m node path towards RP, one single
IP address will occur in around 1/m of these connections
when HS is selecting its first node. The attacker
will then easily identify the IP address of the
Hidden Server as long as m is significantly smaller 
than the number of nodes in the network. 
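
The counting behind the predecessor attack is simple enough to sketch. The example assumes Alice has already kept only the circuits with a positive timing match and recorded, for each, the IP address on the side towards the Hidden Server; the function name is hypothetical. With an m node path, the Hidden Server's own address is expected in roughly 1/m of the entries, while every other address should appear far less often.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def predecessor_statistics(predecessors: Iterable[str]) -> List[Tuple[str, float]]:
    """Return (ip, share of matched circuits), most frequent first."""
    counts = Counter(predecessors)
    total = sum(counts.values())
    return [(ip, n / total) for ip, n in counts.most_common()]
```

The first entry of the result corresponds to the "Largest single IP" column of Table 1 and the second entry to "Second largest".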

Experimental Results: Our results confirmed
the theory of the predecessor paper. Sorting
out the possible circuits based on timing information
and then statistically examining the connecting
IP addresses we immediately found the expected
results from the predecessor attack. In every test⁶
we found that around 50%, or more, of all connections
identified as a part of the stream were made
from a single IP address, as shown under “Largest
single IP” in Table 1.⁷

From the experimental results we can also con- 
clude that we need far less data to pinpoint the lo- 
cation of the Hidden Server than we gathered. A 
rough estimate is that within the order of an hour or 
two we should have a positive match of the location 
of the hidden service using the predecessor attack. 

4.5 Distance Attack 

If there is no information in IP address statistics 
(e.g., due to mixing of traffic or using other coun- 
termeasures), an attacker must use other techniques 
to locate the hidden service. 

When Alice has a dataset that matches the generated
set of data, she can look at the response times in
the communication with the service. The attacker
times the periods where the sampled data switches
from outgoing to incoming traffic (the round trip
time), which enables a rough estimate of the distance
to the Hidden Server. These periods are marked in the
example in Fig. 3 with D for the Client’s response
times and d for the round trip time at the
participating node. By grouping nodes based on
measured round trip times, the attacker is able to
find some groups of nodes closer to the Hidden Server
than others.
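
The measurement described above can be sketched as follows. This is
a minimal illustration, not the tooling used in our experiments: the
event format, the direction labels "out" and "in", the threshold and
the sample values are all assumptions made for the example.

```python
from statistics import mean

def round_trip_times(events):
    """Return gaps between the last outgoing cell and the next incoming cell.

    `events` is a time-ordered list of (timestamp_seconds, direction) pairs,
    where direction is "out" (towards the server) or "in" (back from it).
    """
    rtts = []
    last_out = None
    for timestamp, direction in events:
        if direction == "out":
            last_out = timestamp
        elif last_out is not None:
            rtts.append(timestamp - last_out)  # outgoing -> incoming switch
            last_out = None                    # count each switch only once
    return rtts

def closest_circuits(samples, threshold):
    """Keep circuit ids whose mean round trip time is below `threshold`."""
    return sorted(cid for cid, events in samples.items()
                  if mean(round_trip_times(events)) < threshold)

samples = {  # hypothetical measurements, in seconds
    "circuit-a": [(0.00, "out"), (0.02, "in"), (1.00, "out"), (1.04, "in")],
    "circuit-b": [(0.00, "out"), (0.40, "in"), (1.00, "out"), (1.60, "in")],
}
print(closest_circuits(samples, threshold=0.1))
```

Circuits that fall into the low round trip group are the candidates
for having the attacker's node adjacent to the Hidden Server.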

Experimental Results: Our results confirmed the
assumptions of the distance attack. Using the data
from the experiment we could see a clear correlation
between the response times and the distance from the
Hidden Server. When the Hidden Server was local it
was of course easy to find a match showing the
attacker’s node next to the service (order of
100-1000 compared to the other connections). But even
when our service was located on computers on the
other side of the globe we could still statistically
observe when Alice was connecting directly to the
Hidden Server. The round trip times were on the order
of two or more times larger for the nodes not
adjacent to the Hidden Server, as shown in Fig. 4.
The lower line represents the average response times
for 52 samples of the nodes closest to the Hidden
Server, and the upper line is for the other 35
samples in our set, where Alice is located at Node 2
in Fig. 2. Due to the previously mentioned
implementation feature of Tor we were unable to find
data when Alice is located as Node 3, cf. Section 5.1.

Figure 4. Average round trip times at seven locations
in a sample of circuits

6 This result is for every test running without the
use of “helper” guard nodes. Cf. Section 5.4.

7 Given three nodes between HS and RP, we would
expect to find a common predecessor IP address in
only about 33% of matching connections. The
discrepancy is due to an implementation feature of
Tor uncovered by our experiments. Cf. Section 5.1.

4.6 Owning the Rendezvous Point 

By extending adversary resources and using two 
nodes in the network, it is possible for Alice to run 
attacks where she owns the Rendezvous Point. This 
will significantly enhance the attack. 

Only knowing RP’s IP address will give the attacker
knowledge of when she is the last node (Node 3 in
Fig. 2) in the circuit out from the Hidden Server.
Selection of the Rendezvous Point is done by the
Client, which enables Alice to choose one of her
nodes as RP while still leaving her other node free
to be chosen by HS for the circuit to RP. This allows
Alice to tell when she is the second to last node in
the circuit as well (since both C and RP are
connected to the same node). This implies that if the
path length is three before connecting to HS (as
currently implemented) the attacker is able to
determine the instance where she is Node 1, thus
directly revealing the IP address of the Hidden
Server. The speed and accuracy of the attack are then
greatly improved, and the result will be as fast as
in the Service Location Attack, except that this
own-the-rendezvous attack will identify services
located at network servers as well as those located
at clients.
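
The position inference described in this section can be written
down as a short decision rule. The sketch below assumes a three-node
path, that Alice runs both the relay and RP, and that the circuit
has already been matched by timing analysis; the function name and
the IP addresses are hypothetical.

```python
def infer_position(my_next_hop, rp_ip, rp_predecessor_ip):
    """Guess the position of the attacker's relay on a matched HS-to-RP circuit.

    Assumes a three-node path (Node 1 adjacent to HS, Node 3 adjacent to RP),
    that the attacker also runs RP, and that the circuit has already been
    matched by timing analysis.
    """
    if my_next_hop == rp_ip:
        return 3   # relay hands traffic directly to the Rendezvous Point
    if my_next_hop == rp_predecessor_ip:
        return 2   # relay's successor is the node that connects to RP
    return 1       # remaining case: relay's predecessor is the Hidden Server

# Relay 10.0.0.3 is itself the node connected to RP.
print(infer_position("10.0.0.9", rp_ip="10.0.0.9", rp_predecessor_ip="10.0.0.3"))
# Relay and RP are both connected to 10.0.0.7.
print(infer_position("10.0.0.7", rp_ip="10.0.0.9", rp_predecessor_ip="10.0.0.7"))
# Neither observation holds, so the relay must be Node 1.
print(infer_position("10.0.0.5", rp_ip="10.0.0.9", rp_predecessor_ip="10.0.0.7"))
```

Only in the last case is the address of the relay's predecessor
recorded as the candidate location of the Hidden Server.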

5 Countermeasures: More Hidden Services

5.1 Allowing Middleman nodes to connect to Rendezvous Points

This first point is really about an implementation 
feature of Tor’s hidden services that facilitates our 
attacks rather than a limitation of the hidden ser- 
vices system design. But since changing the feature 
does slow down the attacks, we list it here. 

To save time, all Tor clients (including hidden 
servers) establish circuits offline, i.e., while await- 
ing service requests. Upon receiving a rendezvous 
request and an RP location, HS extends such cir- 
cuits to RP. Tor clients not operating as hidden ser- 
vices typically will need circuits that terminate at 
nodes that allow exit from the Tor network on com- 
mon ports, such as those for http, ssh, and https. 
Almost all of the new “stand-by” circuits established
thus go to a node that allows such exit, which
seems quite reasonable considering normal client 
use. A hidden server should similarly always have 
at least one circuit available at a random node of the 
network ready for connection to the Rendezvous 
Point. 

This creates an advantage for our attacker. By
running in middleman mode (never allowing circuits to
exit the Tor network at that node) she both reduces
the overhead of running a node and guarantees that
whenever her network node is used between HS and RP,
it will almost always be in the first or second
position, which increases the efficiency of her
attack. Our experiments uncovered this undocumented
feature of the Tor implementation. It is a trivial
change to allow the third node from HS to RP to be
any node, not just an exit node. This has now been
implemented by the Tor developers and is available in
the latest versions.

8 We had no occurrence of being Node 3 in the sample
sets described in this paper.
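
The effect of this implementation feature on the attacker's
position can be made concrete with a small calculation. The sketch
below idealizes the situation: it assumes the relay is equally
likely to occupy every position it is eligible for and that the last
hop is always an exit node, whereas the text above says only "almost
all". The function and parameter names are hypothetical.

```python
from fractions import Fraction

def first_hop_share(path_len, last_hop_exit_only, attacker_is_exit):
    """Share of matched circuits in which the attacker's relay is Node 1.

    Simplifying assumption: the relay is equally likely to occupy every
    position it is eligible for on the HS-to-RP path.
    """
    eligible = list(range(1, path_len + 1))
    if last_hop_exit_only and not attacker_is_exit:
        eligible.remove(path_len)      # a middleman relay is never the last hop
    return Fraction(1, len(eligible))

print(first_hop_share(3, last_hop_exit_only=False, attacker_is_exit=False))  # 1/3
print(first_hop_share(3, last_hop_exit_only=True, attacker_is_exit=False))   # 1/2
```

Under these assumptions a middleman relay is Node 1 in about half of
the matched circuits rather than a third, which is consistent with
the roughly 50% share reported for the predecessor attack.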

5.2 Dummy traffic 

In anonymous communication, dummy traffic is a
countermeasure to traffic analysis that is often
initially suggested. However, dummy traffic is
expensive, and, despite research, it has yet to be
shown that dummy traffic defeats any active attacks
on low-latency systems unless the system will also
bring most or all of the network to a stop in
response to one non-sending client (as in Pipenet
[10]). Since this makes it trivial for any user to
bring down the network, it is generally seen as a
price few would pay for anonymity, which means that
even those who would pay it would be hiding in a very
small anonymity set. While some dummy traffic schemes
have been proposed that attempt to address some
active attacks, no fielded low-latency systems
currently use dummy traffic. In light of our attacks,
we describe why dummy traffic would be an especially
ineffective countermeasure for hidden services.

In our attack scenario, Alice can develop a list of
candidate circuits by labeling any circuits through
her network node that show a response from the server
shortly after she sends a request with an RP address
to HS. This would potentially include many false
positives. She can then induce a timing signature in
her network node on all responses from a server on
candidate circuits. This can be done exactly in the
manner of the timing signature used by Murdoch and
Danezis, except that our attacks do not require
collusion by the external server. Alice’s client then
simply looks for the same timing signature. The
important thing to note is that no dummy traffic
scheme could prevent this. If dummy traffic is sent
all the way to the client, Alice can of course detect
it since she controls the client. If dummy traffic is
sent by HS to some node between Alice’s node and
Alice’s client, this will result in some differences
between the induced signature and the one seen by the
client. Nonetheless, for low-latency traffic this
strong signature would clearly remain easily
identifiable. In the experiments of Murdoch and
Danezis, the corrupt server would send for between 10
and 25 seconds and then stop sending for between 30
and 75 seconds. This was detected indirectly in those
experiments by interference with external probes of
Tor nodes. In our case, Alice would have direct
access to the circuits. This also means that the
attack would scale to current network sizes and
beyond (see footnote 9). Note also that no existing
dummy scheme would even affect the signature of
traffic sent from Alice’s client to the Hidden Server
through Alice’s node. While this traffic is typically
much lower in volume, it can still be used to
identify the circuit.
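
Why an induced on/off signature survives padding can be shown with
a toy calculation. The sketch below is not the detection method of
any cited work: the per-slot cell counts, the scoring function and
all names are assumptions chosen for illustration, and dummy traffic
is modelled simply as a few extra cells in every time slot.

```python
def signature_score(pattern, cells_per_slot):
    """Difference in mean cell rate between 'on' and 'off' slots.

    `pattern` is the induced on/off signature (1 = sending, 0 = paused) and
    `cells_per_slot` the cell counts the client observed in the same slots.
    """
    on = [c for p, c in zip(pattern, cells_per_slot) if p == 1]
    off = [c for p, c in zip(pattern, cells_per_slot) if p == 0]
    return sum(on) / len(on) - sum(off) / len(off)

pattern = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]   # hypothetical induced signature
dummy = [3, 2, 3, 3, 2, 3, 2, 3, 2, 3]     # padding added between node and client
marked = [10 * p + d for p, d in zip(pattern, dummy)]   # signature plus padding
unmarked = dummy                            # a circuit without the signature

marked_score = signature_score(pattern, marked)
unmarked_score = signature_score(pattern, unmarked)
print(marked_score > 5, abs(unmarked_score) < 1)
```

In this toy model the padding shifts the cell counts but leaves the
gap between "on" and "off" slots clearly visible on the marked
circuit.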

5.3 Extending the Path from Hidden Server to Rendezvous Point

As described in Section 4.6 and illustrated in
Fig. 2, if Alice owns at least two nodes she can have
her client name one of them as the Rendezvous Point.
If her other node is chosen by HS as Node 2, then she
will be able to confirm this immediately with high
confidence because both her nodes will be connected
to Node 3. And as before, if Alice were connected as
Node 3, then she would also know this. This means
that Alice can easily know when she has a circuit
match being Node 1, which is especially significant
if HS is configured as in the server scenario of
Section 4.3. However, this also means that Alice can
more quickly abandon circuits when she does not have
Node 1 position, speeding up the attack. It also
allows rapid identification of guard nodes (cf.
Section 5.4).

A simple countermeasure to this is to allow HS to
extend the path length, l, to RP by one hop. The
attacker owning the Rendezvous Point will now be able
to determine when she is located as Node 3 or Node 4,
but unable to differentiate between the positions 1
and 2, forcing Alice to use the predecessor or
service location attack. Extending the path will also
slow down the predecessor attack and the timing
analysis by a factor of 1/l since that is the
frequency with which Alice’s node will be chosen as
Node 1 within the matching circuits. So this
countermeasure has only a minor effect on the speed
of the predecessor attack, and has no effect on the
location attack.

9 The attacks of Murdoch and Danezis required probing
the entire network and were done on a network an
order of magnitude smaller than the current Tor
network. It is an open question whether they would
scale to the current one.

Figure 5. Use of Entry Guard Nodes

As an alternative, we could allow HS to choose 
RP. This would be a minor code change to the Tor 
hidden services protocol. Whether adding a node 
before the Rendezvous Point or allowing HS to 
choose RP, this would also seem to imply a longer 
path between client and HS than the current default 
Tor HS protocol, i.e., seven nodes vs. six. This 
would also create an easier attack if our techniques 
were used to locate the client, which would then 
require its own countermeasure. 

5.4 Using Entry Guard Nodes 

All of our attacks rely on Alice being able to force
the Hidden Server to create new circuits until she
can cause it to create a circuit that first connects
directly to her node. What if this could never
happen, or if the rotation of first nodes in the
circuit were slowed down? This would prevent or
substantially slow our attacks. This is the
motivation behind entry guard nodes (or simply entry
guards), a concept introduced by Wright et al. [30]
(see footnote 10). That work looked at attacks on
various anonymous communication systems, but it did
not consider at all the specific concerns of hidden
services. The basic idea of a helper node in [30] was
to always choose a single node as the first node in a
communication. If this is compromised, then that end
of your circuit is always compromised. However, if it
is not compromised, then the attacks we have
described cannot work because Alice will never own
the node adjacent to HS on the rendezvous circuit.
(For a circuit initiator to better hide the
responder, Wright et al. also considered helper nodes
as the last node in the circuit as well as the
first.)

10 Wright et al. named these nodes helper nodes,
which we have found to be too general an expression.

Table 2. Experimental results when Hidden Server is using Entry Guard Nodes.

         Total circuits  Matched   Largest    Second   Third
         completed       circuits  single IP  largest  largest
Test 1   292             8         7          1        0
Test 2   106             6         5          1        0
Test 3   296             13        12         1        0
Test 4   292             10        4          3        3

Tor design has long allowed a specified list of entry
nodes (and exit nodes) which, when specified, will
require all circuits to enter (resp. exit) the
network through nodes in that set, as illustrated in
Fig. 5. This can be set as either a preference
request or as a strict requirement. The effectiveness
of using these as entry guard nodes to counter
predecessor attacks is noted by the Tor designers as
an open research problem. We now explore the idea of
using entry guard nodes specifically to improve the
protection of hidden servers.

There are several parameters and options possible in
choosing entry guard nodes. The first parameter is
the entry guard set size, i.e., the number of nodes
that HS will use as entry guards. The smaller the
set, the less risk that Alice owns a node in it; the
greater, however, the chance that all the nodes in
the set will be subject to monitoring or unavailable
due to failure or attack. As already noted, entry
guard nodes may be preferred or strictly required. As
a refinement, there may be a successively preferred
set of entry guards. There may be a single layer of
entry guard nodes, i.e., nodes chosen to be
immediately adjacent to HS, or they may be layered,
i.e., some guard nodes are chosen to be used for the
first hop, some for the second, and possibly further.
Finally, they may be chosen at random or chosen based
on trust or performance.

11 It is also possible to set an exit node preference
in the URL for specific HTTP requests [27].

Each of these choices is orthogonal, so each of 
the combinations of choice will lead to systems 
with different properties. For space, we will limit 
discussion to some of the more salient combina- 
tions. 
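
The trade-off in entry guard set size can be put in rough numbers.
The following sketch assumes guards are chosen uniformly at random
and, for the unguarded case, that every new circuit picks its first
hop uniformly from the whole network; the network size of 450 nodes
with one hostile node and the function names are hypothetical values
for the example.

```python
from math import comb

def p_attacker_in_guard_set(n_nodes, attacker_nodes, guard_set_size):
    """Probability that a uniformly chosen guard set contains an attacker node."""
    honest = n_nodes - attacker_nodes
    return 1 - comb(honest, guard_set_size) / comb(n_nodes, guard_set_size)

def p_first_hop_without_guards(n_nodes, attacker_nodes, n_circuits):
    """Probability that an attacker node is the first hop at least once."""
    return 1 - (1 - attacker_nodes / n_nodes) ** n_circuits

# Hypothetical network of 450 nodes with a single hostile node.
for size in (1, 3, 9):
    print(size, round(p_attacker_in_guard_set(450, 1, size), 4))
print(round(p_first_hop_without_guards(450, 1, 1000), 4))
```

With a fixed guard set the attacker's chance of ever being adjacent
to HS stays at the one-off probability of being in the set, whereas
without guards it grows with every circuit she forces HS to build.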

Choosing a small set of entry guard nodes that 
are both permanent and strictly required could lead 
to a higher percentage of service failures, either 
by accident or by design (assuming a very power- 
ful adversary, with the capability to DoS all entry 
guard nodes). If a permanent set is simply a prefer- 
ence, then DoS of all entry guard nodes could lead 
back to our attacks if the guard nodes can be kept 
down long enough. Of course this assumes that en- 
try guards can be identified by the attacker. We ran 
our attacks on a hidden server that had chosen three 
entry guard nodes. 

Experiment - Attacking Entry Guard Nodes: Letting the
Hidden Service use three permanent, preferred entry
guards, we found that these nodes combined
represented all identified connections through
Alice’s node, as shown in Table 2. This was a quite
unexpected result, but it was caused by the
implementation feature in Tor described earlier: we
were never Node 3, only Node 2 (Node 1 being the
entry guard).

As in our previous experiments, identifying the 
entry guard nodes through our attacks never took 
more than a few hours. 
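
The claim that the three entry guards account for all identified
connections can be checked directly against the rows of Table 2. The
short script below only restates the published counts; the variable
names are our own.

```python
# Rows of Table 2: (matched circuits, largest, second largest, third largest)
table2 = {
    "Test 1": (8, 7, 1, 0),
    "Test 2": (6, 5, 1, 0),
    "Test 3": (13, 12, 1, 0),
    "Test 4": (10, 4, 3, 3),
}

for test, (matched, *top_three) in table2.items():
    covered = sum(top_three)                  # connections from the top three IPs
    largest_share = top_three[0] / matched    # share of the most frequent IP
    print(test, covered == matched, round(largest_share, 2))
```

In every row the three most frequent IP addresses sum to the number
of matched circuits.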

Backup guard nodes: Suppose there is a short list of
entry guard nodes that is preferred (e.g., three
nodes) and a longer list of guard nodes to be used as
backup (e.g., nine) if those are not available. If
the adversary has the capability of keeping say four
nodes at a time offline, then it can cause HS to use
other nodes in the Node 1 position than those on the
short list. But, all that will accomplish is causing
HS to rotate to three new nodes from the longer list
as primary guards. Alice can cause rotation of
circuits but only through a still relatively small
set of entry guard nodes, and this only through
sustained attacking. We can however make it more
difficult for Alice to find the entry guard nodes at
all via our attacks.

Layering guard nodes: Suppose, e.g., that HS has a
set of three entry guard nodes from which to choose
Node 1, and for each of these has a set of three
guard nodes from which to choose Node 2, as
illustrated in Fig. 6. As before, Alice can only
successfully find HS if she owns one of these three
layer 1 entry guard nodes. But if she does not, in
order to even identify one of those layer 1 nodes to
attack she must own one of the three layer 2 guard
nodes associated with that node.

Figure 6. Use of Layered Entry Guard Nodes

Layering guard nodes would require much more
substantial changes to the Tor code than the
experiments we have already run, albeit probably
fairly straightforward changes. We have thus not had
a chance to conduct experiments on either layering or
backup guard node configurations. However, a version
of backup guard nodes has recently been implemented
in Tor in response to our results. At the time of
writing, by default each client running the latest
Tor code chooses three nodes as initial preferred
entry guards. When the available entry guard set
shrinks below two nodes, two more nodes are added to
the set. However, the software keeps track of when a
node enters the set and prefers to choose entry nodes
for circuits that were in the set sooner. Nodes are
only deleted from the set when they have been
unreachable for an extended period (currently one
month).
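
The guard set maintenance rules just described (three initial
guards, two more added when fewer than two are available, preference
for guards that entered the set sooner, removal after about a month
of unreachability) can be modelled in a few lines. This is a toy
model for illustration only and is not Tor's code: the class, its
methods, the daily update granularity, the 30-day reading of "one
month", and the strict earliest-first choice are all assumptions.

```python
class GuardSet:
    """Toy model of the backup-guard behaviour described in the text."""

    INITIAL = 3          # guards chosen at start
    MIN_AVAILABLE = 2    # refill when fewer guards than this are reachable
    REFILL = 2           # number of guards added on refill
    MAX_DOWN_DAYS = 30   # "one month" unreachable, assumed to mean 30 days

    def __init__(self, candidates):
        self.candidates = list(candidates)   # nodes not yet used as guards
        self.guards = []                     # kept in order of entry to the set
        self.down_days = {}                  # guard -> consecutive days unreachable
        self._add(self.INITIAL)

    def _add(self, count):
        for _ in range(min(count, len(self.candidates))):
            guard = self.candidates.pop(0)
            self.guards.append(guard)
            self.down_days[guard] = 0

    def update(self, reachable):
        """Record one day of reachability results and refill if needed."""
        for guard in list(self.guards):
            if guard in reachable:
                self.down_days[guard] = 0
            else:
                self.down_days[guard] += 1
            if self.down_days[guard] >= self.MAX_DOWN_DAYS:
                self.guards.remove(guard)
                del self.down_days[guard]
        available = [g for g in self.guards if self.down_days[g] == 0]
        if len(available) < self.MIN_AVAILABLE:
            self._add(self.REFILL)

    def pick_entry(self):
        """Prefer the reachable guard that entered the set earliest."""
        for guard in self.guards:
            if self.down_days[guard] == 0:
                return guard
        return None

guards = GuardSet(["g%d" % i for i in range(1, 10)])
guards.update(reachable={"g3"})          # adversary keeps g1 and g2 offline
print(guards.guards, guards.pick_entry())
```

In this model an adversary who takes two guards offline only causes
two backups to be added; the earlier guards stay in the set and are
preferred again as soon as they return.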


Nonrandom choice of entry guard node sets: To avoid
circuit rotation simply from failed entry
guard nodes it might seem that it is best to choose 
as guard nodes those that have the best uptime, 
and perhaps bandwidth. This is, however, subject 
to abuse since adversaries may run highly reliable, 
highly performing nodes in order to increase their 
chances of being chosen as entry guard nodes. And, 
this is especially easy to abuse in the current Tor 
directory statistics in which nodes report their own 
performance. This is a specific instance of a more 
general problem in trying to build reliable anony- 
mous communication. One possible solution is to 
order node performance and reliability but then to 
choose from a large enough set in this order that 
the adversary is unlikely to be able to substantially 
alter the chances of being chosen as an entry guard 
node. Dingledine and Syverson described this strategy
to form a reliable anonymous communication network of
mix cascades.
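
One way to read this strategy is sketched below: rank nodes by
reported performance, then draw the guards uniformly from a pool of
the best-ranked nodes rather than simply taking the top few. The
function name, the scoring scheme and the pool size are assumptions
for the example and are not taken from any deployed system.

```python
import random

def choose_guards(nodes, pool_size, n_guards, seed=None):
    """Pick guards uniformly from the `pool_size` best-ranked nodes.

    `nodes` maps a node name to a (self-reported) uptime score. A larger pool
    dilutes the advantage an adversary gains by advertising high scores.
    """
    ranked = sorted(nodes, key=nodes.get, reverse=True)
    pool = ranked[:pool_size]
    return random.Random(seed).sample(pool, n_guards)

nodes = {"node-%d" % i: 100 - i for i in range(50)}   # hypothetical scores
nodes["hostile"] = 1000                                # adversary inflates its score
guards = choose_guards(nodes, pool_size=20, n_guards=3, seed=1)
print(guards)
```

Here the hostile node is certain to be in the pool, but with a pool
of 20 and three guards its chance of being selected is 3/20 rather
than 1, which is the dilution the text refers to.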

Another possibility is to choose entry guard 
nodes based on trust in the node administrator. It 
is difficult to attach probabilities to Alice’s being 
trusted by the Hidden Server administrator, or per- 
haps more likely, to compromise a node run by 
someone trusted by the Hidden Server administra- 
tor. (Trust in honesty should not be confused with 
trust in competence.) Perhaps a greater concern 
is that common properties of the administrators of 
chosen entry guard nodes (e.g., they are all family 
relatives) may lead an adversary to form a hypoth- 
esis of who is running HS, which may then lead to 
attacks unrelated to use of the network per se. Here 
the layering approach described above may prove 
useful. If the layer 1 nodes are personally trusted, 
and the layer 2 nodes are chosen as random sets, 
then it becomes more difficult for an adversary to 
discover the set of entry guard nodes and thus to 
correlate external properties. 
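
A layered configuration of this kind could be represented as a
simple mapping from each layer 1 guard to its own layer 2 set. The
sketch below is illustrative only: the node names are hypothetical,
and the uniform choice within each layer is an assumption.

```python
import random

def build_first_hops(layers, rng):
    """Choose Node 1 and Node 2 from a layered guard configuration.

    `layers` maps each layer 1 guard to its own list of layer 2 guards.
    """
    node1 = rng.choice(sorted(layers))      # layer 1: adjacent to HS
    node2 = rng.choice(layers[node1])       # layer 2: tied to the chosen Node 1
    return node1, node2

layers = {  # hypothetical names: trusted layer 1 guards, random layer 2 sets
    "trusted-a": ["r1", "r2", "r3"],
    "trusted-b": ["r4", "r5", "r6"],
    "trusted-c": ["r7", "r8", "r9"],
}
node1, node2 = build_first_hops(layers, random.Random(0))
print(node1, node2)
```

An observer further along the circuit sees only the layer 2 node, so
the personally trusted layer 1 nodes are not directly exposed.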
6 Conclusion

Our results show that Tor’s location-hidden 
servers are not really hidden — or rather they were 
not really hidden prior to the recent introduction 
of guard nodes as countermeasures to our attacks. 
Using our attacks, all an attacker needed was one 
compromised node in the network and the “Hidden 
Server” was identified. 

We have demonstrated that an attack with one 
compromised node in the anonymity network takes 
only minutes if the service is located at a client, or a 
couple of hours when located on a server node. By 
using two nodes in the network it only takes min- 
utes to find the Hidden Server regardless of where 
it is located. 

We have also argued that neither dummy traf- 
fic nor extending the path length from the Hidden 
Server to the Rendezvous Point will protect against 
all of our attacks. However, requiring hidden ser- 
vices to always use entry guard nodes, which are 
currently available as a general option in the Tor 
code, greatly reduces the probability of successful 
attacks against a hidden service. 

Using random entry guard nodes may still leave 
the Hidden Server vulnerable to our attacks if the 
attacker is powerful enough to completely deny
service to a small set of nodes or to compromise them
by physical or other means. But, using backup 
guard nodes and/or layering guard nodes will sig- 
nificantly slow down even such an attacker. 

Using random selection of backup and layering 
entry guard nodes will be an improvement, but as in 
all Tor circuits, someone connecting through random
nodes will always be compromised if an attacker owns
just two nodes. Using the backup
and layering techniques in combination with a non- 
random selection, e.g. based on some kind of trust, 
or experience, with the nodes, may slow the attack 
even more or may even prevent it entirely. 

We have demonstrated attacks that surprisingly
require just one or two hostile nodes. What is
possible for an adversary that controls several
nodes, or even, e.g., two percent of the network? We
will investigate this in future work. Other future
work includes testing implemented countermeasures for
vulnerabilities using the described attack scenarios,
as well as new ones. We will also be investigating
improved performance by shrinking the path length
between the Client and the Hidden Server. We
speculate that it may be possible to do so with
adequate security when using our suggested
countermeasures, and possibly others. We will also
turn our attack on its head: testing to locate a
client by having an attractive hidden service.